Concerning the accuracy of MAST E-values
Abstract
A recent article on the IMPALA algorithm by Schäffer et al. (1999) gives the misleading impression that the E-values returned by the MAST algorithm (Bailey and Gribskov, 1998) are unreliable. This erroneous conclusion was reached due to an inappropriate protocol in the experiment involving MAST in the IMPALA article. We have tested MAST using a redesigned protocol that causes MAST to behave as the IMPALA authors intended. This test reveals that MAST E-values are just as accurate as those reported by two of the other search algorithms studied in the IMPALA article, and nearly as accurate as those reported by the IMPALA algorithm.

Superficially, MAST and IMPALA appear to be designed to perform complementary operations, but they are not. MAST is designed to search a target database of protein or nucleotide sequences for matches to a query consisting of a set of position-specific scoring matrices (PSSMs). MAST treats the set of PSSMs as a single signature describing, for example, a protein family or regulatory region. Consequently, for each sequence in the database, MAST computes a single similarity score that combines the similarity of the sequence to each of the PSSMs in the query set. The IMPALA algorithm, on the other hand, is designed to search a database of PSSMs for matches to a protein or nucleotide query sequence. Unlike MAST, however, IMPALA treats each PSSM as the signature of a separate family and never combines scores from different PSSMs. Thus, using MAST to search a database consisting of a single sequence using a set of PSSMs as the query has an entirely different effect than using IMPALA to search the set of PSSMs using the sequence as the query.

The experimental protocol used in the IMPALA article does not run MAST in the way the article's authors intended, and virtually guarantees that the E-values will not be accurate. To measure E-value accuracy, the article's protocol runs 467 MAST searches. In each case, the query is the entire set of wolf1187 PSSMs (converted to MAST input format), and the target is a single sequence. Since all of the PSSMs were placed into a single file, MAST treated them as the signature of a single protein family, rather
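The difference between the two search modes can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not say how MAST combines per-PSSM scores, so the sketch takes the per-PSSM p-values to be independent and combines them through their product, using the standard closed form for the distribution of a product of independent uniform variables. The function names and the simple "p-value times database size" E-value scaling are hypothetical and are not taken from MAST or IMPALA.

```python
import math


def product_pvalue(pvalues):
    """P-value of the product of n independent uniform(0, 1) p-values.

    Uses the closed form P(prod <= p) = p * sum_{i=0}^{n-1} (-ln p)^i / i!.
    This is an illustrative assumption, not a statement of MAST's internals.
    """
    n = len(pvalues)
    p = math.prod(pvalues)
    if p <= 0.0:
        return 0.0
    neg_log = -math.log(p)
    term, total = 1.0, 1.0
    for i in range(1, n):
        term *= neg_log / i  # running value of (-ln p)^i / i!
        total += term
    return min(1.0, p * total)


def combined_signature_evalue(pvalues, n_sequences):
    """One E-value per sequence: all PSSMs treated as a single signature."""
    return n_sequences * product_pvalue(pvalues)


def per_matrix_evalues(pvalues, n_matrices):
    """One E-value per PSSM: each matrix treated as a separate family."""
    return [n_matrices * p for p in pvalues]


if __name__ == "__main__":
    # Hypothetical sequence: strong match to one PSSM, no match to nine others.
    pvals = [1e-6] + [0.5] * 9
    print("combined signature:", combined_signature_evalue(pvals, n_sequences=1))
    print("per matrix:", per_matrix_evalues(pvals, n_matrices=len(pvals)))
```

Under these assumptions, a sequence that matches only one of many unrelated PSSMs is scored against the whole set in the combined mode, whereas the per-matrix mode reports the strong match on its own. This mirrors why placing all PSSMs in a single query file changes the meaning of the reported E-values.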
Journal: Bioinformatics
Volume 16, Issue 5
Publication year: 2000